make murmurhash3_x64_128 compatible with existing cuco data structures #501

srinivasyadav18 · 2024-06-05T17:05:37Z

Make murmurhash3_x64_128 compatible with existing cuco data structures

Moves the sanitize_hash function from the cuco::detail namespace into the probing_scheme_base class as a protected member.
Modifies the sanitize_hash function to handle cuda::std::array<uint64_t, 2>, which is the return type of the murmurhash3_x64_128 hash function.
Adds a new hash_test.cu to test the static_map API with all hash functions.

copy-pr-bot · 2024-06-05T17:05:41Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

PointKernel

Some minor cleanups.

Based on the offline discussions, we are not happy with the fact that sanitize_hash has to be invoked twice in CG-based probing. @sleeepyjack Any idea how to improve the situation?

include/cuco/detail/probing_scheme_base.cuh

include/cuco/detail/probing_scheme_impl.inl

include/cuco/detail/utils.cuh

PointKernel · 2024-06-06T22:46:12Z

/ok to test

sleeepyjack · 2024-06-08T00:50:20Z

I have some high-level questions/suggestions regarding this PR:

Why do we need to move the sanitize_hash function to the probing_scheme_base?
In my opinion, there is no real-world use for a size type wider than size_t. So why not simply truncate the hash value to that maximum bit width and then call sanitize_hash on it?
The sanitize_hash function still has some unresolved problems (see draft PR Fix hash to size conversion #362) which I haven't been able to fix yet. Any chance we can also get this resolved in this PR? Should be a small but fun challenge @srinivasyadav18 ;)
nit: GH regularly gets confused and loses the discussion history when rebasing+force-pushing to the PR branch. We're fine with just adding or merging on top of the current HEAD...we will squash-merge the branch anyway so the intermediate commits will not be visible in the default branch.

PointKernel · 2024-06-08T01:12:54Z

Why do we need to move the sanitize_hash function to the probing_scheme_base?

I suggested to do so since probing is the only place using sanitize_hash.

So why not simply truncate the hash value to that max bit width and then call sanitize_hash on it?

It works but technically we are not using the hash value in a proper way, is it fair to say so?

sleeepyjack · 2024-06-08T01:59:38Z

I suggested to do so since probing is the only place using sanitize_hash.

And I agree that it makes sense to put utils with the class where they have one-time use. However, since it's a template function the syntax doesn't get much cleaner doing so: probing_scheme_base::template sanitize_hash<SizeType>(...) vs. cuco::detail::sanitize_hash<SizeType>(...).

It works but technically we are not using the hash value in a proper way, is it fair to say so?

static_cast<size_t>(some_uint128) will also just truncate the upper bits so I'd say it's valid. We may want to add a static_assert to our extent type to limit the bit width to 64.
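
The truncation idea discussed above can be sketched in plain host C++. This is only an illustration, not the cuco implementation: truncate_hash is a hypothetical name, std::array stands in for cuda::std::array, and treating element 0 as the low word is an assumption. The static_assert mirrors the suggested 64-bit limit on the size type.

```cpp
#include <array>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper (not part of cuco): reduce a two-word 128-bit hash
// to SizeType by keeping only the low bits of one word.
template <typename SizeType>
constexpr SizeType truncate_hash(std::array<std::uint64_t, 2> const& hash) noexcept
{
  static_assert(std::is_integral_v<SizeType>, "SizeType must be an integral type");
  static_assert(sizeof(SizeType) <= sizeof(std::uint64_t), "SizeType is limited to 64 bits");

  using unsigned_type = std::make_unsigned_t<SizeType>;

  // Unsigned narrowing is well-defined: the value is reduced modulo 2^N.
  // Assumption: hash[0] holds the low 64 bits.
  auto const low = static_cast<unsigned_type>(hash[0]);

  // Clear the would-be sign bit so the result is non-negative for signed SizeType.
  // For unsigned SizeType the mask is all ones and changes nothing.
  constexpr auto mask = static_cast<unsigned_type>(std::numeric_limits<SizeType>::max());
  return static_cast<SizeType>(low & mask);
}
```

For example, truncate_hash<std::int32_t> applied to a hash whose low word has all bits set yields numeric_limits<int32_t>::max(); the upper word is ignored entirely.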

include/cuco/detail/utils.cuh

sleeepyjack · 2024-06-13T23:54:12Z

include/cuco/detail/utils.cuh

+template <typename SizeType, typename HashType>
+__host__ __device__ constexpr SizeType sanitize_hash(HashType hash, std::uint32_t cg_rank) noexcept
+{
+  return sanitize_hash<SizeType>(sanitize_hash<SizeType>(hash) + cg_rank);


There exists a scenario where this approach fails and it's the reason CI in my initial draft PR failed.
Consider the following example:

SizeType is int32_t aka a signed type

The value of sanitize_hash<SizeType>(hash) is very close to numeric_limits<SizeType>::max()

In this scenario, if we compute sanitize_hash<SizeType>(hash) + cg_rank, there's a chance the result exceeds numeric_limits<SizeType>::max(), which is a signed integer overflow and therefore undefined behavior under the C++ abstract machine. The compiler is thus free to produce any garbage code around this call.

To solve this, we need to check whether sanitize_hash<SizeType>(hash) > (numeric_limits<SizeType>::max() - group.size()) (be careful with > and >=, I'm infamous for my off-by-one errors) and then compute the wrapped-around value manually if this expression evaluates to true.
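
One way the check described above could look is sketched below in host C++. It is not the code that landed in the PR: add_rank_wrapped is a hypothetical name, it compares against the rank being added rather than group.size(), and wrapping so that max() + 1 maps to 0 is an assumption. The point is that the comparison is done on the remaining headroom, so the overflowing sum is never formed.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of cuco): add a CG rank to an already
// sanitized (non-negative) value without ever overflowing SizeType.
// Preconditions (assumed): 0 <= sanitized, and cg_rank fits in SizeType.
template <typename SizeType>
constexpr SizeType add_rank_wrapped(SizeType sanitized, std::uint32_t cg_rank) noexcept
{
  constexpr SizeType max = std::numeric_limits<SizeType>::max();

  auto const rank = static_cast<SizeType>(cg_rank);
  // Cannot overflow because 0 <= sanitized <= max.
  auto const headroom = max - sanitized;

  if (rank > headroom) {
    // sanitized + rank would exceed max: wrap manually so that max + 1 maps to 0.
    return rank - headroom - 1;
  }
  return sanitized + rank;
}
```

With SizeType = int32_t, a sanitized value of max() and a rank of 1 returns 0, while max() - 1 and a rank of 1 returns max() unchanged by wrapping, which is exactly the > versus >= boundary mentioned above.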

Thanks for the detailed explanation!
I think 3ae1afe covers this case now.

PointKernel

Some styling nits, otherwise good to go.

include/cuco/detail/utils.cuh

include/cuco/detail/probing_scheme_base.cuh

PointKernel · 2024-06-18T17:42:03Z

/ok to test

PointKernel

Great work as always! Thanks!

include/cuco/detail/utils.cuh

sleeepyjack · 2024-06-19T01:20:54Z

include/cuco/detail/probing_scheme_impl.inl

  return detail::probing_iterator<Extent>{
-    cuco::detail::sanitize_hash<size_type>(hash_(probe_key) + g.thread_rank()) % upper_bound,
+    cuco::detail::sanitize_hash<cg_type, size_type>(g, hash_(probe_key)) % upper_bound,


I would move the size_type to the front of the tparam list so you don't have to specify the cg_type as it can be inferred from g.

It's about the intention of the API and the syntax consistency.

Sure, the g parameter should be the first one for consistency reasons but we can still use a different ordering for the tparam list, i.e., the one that lets us make use of automatic type inference. The only tparam that cannot be inferred is the result size type so specifying the CG type is redundant.
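
The ordering argument can be illustrated with a small host-side sketch. The names fake_group and sanitize_hash_sketch are hypothetical, and the body is only a placeholder (it is not the cuco logic); what matters is that SizeType comes first in the template parameter list while g stays the first function parameter, so the group and hash types are deduced.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a cooperative group; only thread_rank() is used.
struct fake_group {
  std::uint32_t rank;
  constexpr std::uint32_t thread_rank() const noexcept { return rank; }
};

// SizeType is listed first because it is the only parameter that cannot be
// deduced. CG and HashType are inferred from the function arguments.
template <typename SizeType, typename CG, typename HashType>
constexpr SizeType sanitize_hash_sketch(CG const& g, HashType hash) noexcept
{
  // Placeholder body: add in unsigned 64-bit arithmetic (wraps, never UB),
  // then mask to the non-negative range of SizeType.
  constexpr auto mask = static_cast<std::uint64_t>(std::numeric_limits<SizeType>::max());
  auto const sum = static_cast<std::uint64_t>(hash) + g.thread_rank();
  return static_cast<SizeType>(sum & mask);
}
```

A call then reads sanitize_hash_sketch<std::int32_t>(g, hash), with no need to name the group type, whereas the reverse ordering forces the caller to write both types.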

It's an internal API so I don't want to bikeshed too much. I'm okay with merging it as is.

include/cuco/detail/utils.cuh

sleeepyjack

🚀 🚀 🚀

Co-authored-by: Yunsong Wang

sleeepyjack · 2024-06-19T02:21:45Z

/ok to test

srinivasyadav18 requested review from sleeepyjack and PointKernel as code owners June 5, 2024 17:05

Make murmurhash3_x64_128 compatible with existing cuco data structures

2001837

srinivasyadav18 force-pushed the murmur128_compatibility branch from 6af4877 to 2001837 Compare June 6, 2024 21:45

PointKernel requested changes Jun 6, 2024

srinivasyadav18 force-pushed the murmur128_compatibility branch from 469e39b to 2001837 Compare June 6, 2024 22:57

move back sanitize_hash to utils.cuh

56efe03

sleeepyjack requested changes Jun 13, 2024

fix sanitize_hash edge case

3ae1afe

PointKernel requested changes Jun 18, 2024

include/cuco/detail/utils.cuh (outdated, resolved)

include/cuco/detail/probing_scheme_base.cuh (outdated, resolved)

minor styling changes with CG

c203168

PointKernel requested a review from sleeepyjack June 18, 2024 17:42

PointKernel approved these changes Jun 18, 2024

PointKernel mentioned this pull request Jun 19, 2024

Add unit test for scalar vs. CGSize1 probing #509

Merged

sleeepyjack reviewed Jun 19, 2024

include/cuco/detail/utils.cuh (outdated, resolved)

sleeepyjack reviewed Jun 19, 2024

PointKernel reviewed Jun 19, 2024

include/cuco/detail/utils.cuh (outdated, resolved)

sleeepyjack approved these changes Jun 19, 2024

sleeepyjack and others added 2 commits June 19, 2024 04:12

Always use curly braces

8feddca

Co-authored-by: Yunsong Wang

Re-order CG Type

6657e23

sleeepyjack merged commit c63ac89 into NVIDIA:dev Jun 19, 2024
15 checks passed
